Improving Data Integration through Disambiguation Techniques
Abstract
In this paper, the Word Sense Disambiguation (WSD) issue in the context of data integration is outlined, and an Approximate Word Sense Disambiguation (AWSD) approach is proposed for the automatic lexical annotation of structured and semi-structured data sources.
Similar resources
Improving the Impact of Subjectivity Word Sense Disambiguation on Contextual Opinion Analysis
Subjectivity word sense disambiguation (SWSD) is automatically determining which word instances in a corpus are being used with subjective senses, and which are being used with objective senses. SWSD has been shown to improve the performance of contextual opinion analysis, but only on a small scale and using manually developed integration rules. In this paper, we scale up the integration of SWS...
Refining the most frequent sense baseline
We refine the most frequent sense baseline for word sense disambiguation using a number of novel word sense disambiguation techniques. Evaluating on the Senseval-3 English all words task, our combined system focuses on improving every stage of word sense disambiguation: starting with the lemmatization and part of speech tags used, through the accuracy of the most frequent sense baseline, to hig...
Data-Intensive Question Answering
INTRODUCTION Data-driven methods have proven to be powerful techniques for natural language processing. It is still unclear to what extent this success can be attributed to specific techniques, versus simply the data itself. For example, in [Banko and Brill 2001] it was demonstrated that for confusion set disambiguation, a prototypical disambiguation-in-string-context problem, the amount of data...
Optimization of Word Sense Disambiguation Using Clustering in Weka
In the Natural Language Processing (NLP) community, Word Sense Disambiguation (WSD) has been described as the task of selecting the appropriate meaning (sense) for a given word in a text or discourse, where this meaning is distinguishable from other senses potentially attributable to that word. These senses can be seen as the target labels of a classification problem. Clustering and classifica...
Using co-occurrence tendencies to improve Cross-Language Information Retrieval
Query disambiguation is considered one of the most important methods for improving the effectiveness of information retrieval. In the present paper, we focus on query term disambiguation via a combined statistical method applied both before and after translation, in order to avoid source language ambiguity as well as incorrect selection of target translations. By combining query expansion with dict...
Publication date: 2008